In [1]:
import pandas as pd

# Use the first column (person_id) as the index
df = pd.read_csv('2014-15-player-per-game-averages.csv', index_col=0)

df.head()


Out[1]:
           last_name  first_name  position  height_inches  weight_lbs   min   pts  fg_pct  reb  ast  blk  stl
person_id
203112           Acy      Quincy   Forward             79         240  18.9   5.9   0.459  4.4  1.0  0.3  0.4
203500         Adams      Steven    Center             84         255  25.3   7.7   0.544  7.5  0.9  1.2  0.5
201167       Afflalo       Arron     Guard             77         215  32.1  13.3   0.424  3.2  1.7  0.1  0.5
201582        Ajinca      Alexis    Center             86         248  14.1   6.5   0.550  4.6  0.7  0.8  0.3
203128       Aldemir      Furkan   Forward             82         240  13.2   2.3   0.513  4.3  0.7  0.4  0.4

That was a little easier than my own helper class :)

Let's now examine a scatter matrix, which plots every pairwise combination of the selected numerical stats.


In [16]:
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

scatter_matrix(
    df[['height_inches', 'weight_lbs', 'min', 'pts', 'reb', 'ast', 'blk']],
    figsize=(12, 12)
)
# plt.savefig('nba-scatter.png')
None  # end the cell with None so the notebook does not echo the returned Axes array
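
Each panel of the scatter matrix can also be summarized by a single number. The cell below is a minimal sketch, not part of the original analysis: it computes Pearson correlations for the same seven columns, using only the five rows displayed in Out[1] so that it runs without the CSV file. The `sample` and `corr` names are made up for illustration, and five players are far too few to draw conclusions from.

```python
import pandas as pd

# Only the five rows displayed in Out[1]; the real CSV has many more players.
sample = pd.DataFrame(
    {
        "height_inches": [79, 84, 77, 86, 82],
        "weight_lbs": [240, 255, 215, 248, 240],
        "min": [18.9, 25.3, 32.1, 14.1, 13.2],
        "pts": [5.9, 7.7, 13.3, 6.5, 2.3],
        "reb": [4.4, 7.5, 3.2, 4.6, 4.3],
        "ast": [1.0, 0.9, 1.7, 0.7, 0.7],
        "blk": [0.3, 1.2, 0.1, 0.8, 0.4],
    },
    index=pd.Index([203112, 203500, 201167, 201582, 203128], name="person_id"),
)

# Pearson correlation for every pair of columns: one value per scatter panel.
corr = sample.corr()
print(corr.round(2))
```

On the full data, the same call would be `df[[...]].corr()` with the column list passed to `scatter_matrix`.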